index.md (1408B)
1 +++ 2 title = "Huffman codes" 3 +++ 4 5 # Huffman codes 6 7 lossless encoding of characters, the ones that occur more frequently have a shorter encoding 8 9 e.g. encode “mississippi” 10 11 to generate a tree: 12 1. Build a table of values with frequencies: 13 14 <table> 15 <tr><td> m </td><td> i </td><td> s </td><td> p </td></tr> 16 <tr><td> 1 </td><td> 4 </td><td> 4 </td><td> 2 </td></tr> 17 </table> 18 19 1. Repeat until tree is complete: take two smallest nodes from list, create a new node that is sum of two smallest, add the two smallest as children (right > left), then add the parent node back into the list. 20 21    22   23 24 1. Assign weights to left-right branches — left is 0, right is 1. 25 26  27 28 1. Traverse tree to generate a code for each character: 29 30 <table> 31 <th>letter</th><th>code</th> 32 <tr><td> a </td><td> 1100 </td></tr> 33 <tr><td> b </td><td> 1101 </td></tr> 34 <tr><td> c </td><td> 100 </td></tr> 35 <tr><td> d </td><td> 101 </td></tr> 36 <tr><td> e </td><td> 111 </td></tr> 37 <tr><td> f </td><td> 0 </td></tr> 38 </table> 39 40 ## Complexity 41 with a min-heap-based min-priority queue, in O(nlogn)